## 2021-05-29 14:42:34 INFO::Number of metabolite-protein interactions: 172620
## 2021-05-29 14:42:34 INFO::Number of metabolite-metabolite interactions: 2475
## 2021-05-29 14:42:34 INFO::Number of protein-protein interactions: 402396

Introduction

This R package accompanies our publication “Immuno metabolome atlas”. It constructs immune system-related protein-metabolite interaction networks. The package provides an RShiny application as well as a set of tools for constructing graphs and identifying important biological processes.

Installation

Use the devtools package to install our package and its dependencies.

install.packages("devtools")
devtools::install_github("vanhasseltlab/ImmuneMetAtlas")

Getting Started

Setting up the configuration file

To link the Python and R scripts, a configuration file is used, called config.yaml. This file needs to contain the following fields:

  • folder: Path where the data files will be stored
  • relative_path: True/False. Whether the given folder is relative to the working directory
  • GO_ID: Gene Ontology identifier whose offspring terms are all used as a filter

By default, the following settings are used.

folder: Data
relative_path: True
GO_ID: GO:0002376
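
As a minimal sketch of how these fields could be checked before running anything else, the snippet below writes the default settings to a temporary config.yaml, reads them back, and resolves the data folder. The helper read_simple_config() is hypothetical and not part of the package; it only handles flat "key: value" lines, so a dedicated YAML parser would be the more robust choice. How the package itself interprets relative_path is an assumption here.

```r
# Write the default settings to a temporary config.yaml (illustration only).
config_path <- file.path(tempdir(), "config.yaml")
writeLines(
  c("folder: Data", "relative_path: True", "GO_ID: GO:0002376"),
  config_path
)

# Hypothetical helper: parses flat "key: value" lines, splitting on the first colon
# so that values such as "GO:0002376" stay intact.
read_simple_config <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(trimws(lines))]
  keys <- trimws(sub(":.*$", "", lines))
  values <- trimws(sub("^[^:]*:", "", lines))
  as.list(setNames(values, keys))
}

config <- read_simple_config(config_path)

# Check that all three required fields are present.
required <- c("folder", "relative_path", "GO_ID")
missing_fields <- setdiff(required, names(config))
if (length(missing_fields) > 0) {
  stop("Missing config fields: ", paste(missing_fields, collapse = ", "))
}

# Assumption: a relative folder is resolved against the working directory.
data_folder <- if (identical(config$relative_path, "True")) {
  file.path(getwd(), config$folder)
} else {
  config$folder
}
print(data_folder)
```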

Preprocessing data

Before using this package, you will need to obtain data using our Python scripts. These are located in the Python folder and only have to be executed once. The scripts download and extract all data to the folder given in the configuration file. Preprocessing is done by running the following code:

library(ImmunoMet)
run_preprocessing("path/to/config.yaml")

Start the RShiny app

library(ImmunoMet)
load_data(config = "path/to/config.yaml")
run_shiny()

Non-Shiny usages

Example, static graph

library(ImmunoMet)
load_data(config = "path/to/config.yaml")
graph <- example_graph()
plot(graph)

To get an interactive graph similar to the one in the Shiny app, convert the graph to a Plotly object.

to_plotly(graph)

Advanced usages

The function get_graph() is the main function for obtaining a graph structure representing your search. It takes the following parameters; see the documentation for additional descriptions and examples.

  • filter (default: NA): The search term for building the graph. Cannot be empty.
  • neighbours (default: 0): Integer value representing the number of neighbours (steps) to be found.
  • max_neighbours (default: Inf): Maximum number of edges a node is allowed to have. Can be used to remove super hubs, e.g. Water or ATP.
  • simple (default: FALSE): Returns a bare-bones igraph object with only identifiers and names. Useful when only the graph structure is needed.
  • omit_lipids (default: FALSE): Removes any metabolite with the superclass 'Lipids and lipid-like molecules'.
  • type (default: Gene Ontology): Type of the search filter. Can be one of the following: `Gene Ontology`, `Metabolite/Proteins`, `Pathways`, `GO Simple`, `Superclasses`, `Classes`.
  • search_mode (default: Interacts): Determines how interactions are found. `Interacts` will find all possible interactions, while `Between` will only find interactions between the given metabolites / proteins. `Shortest Path` will find the shortest route between the given metabolites / proteins, including indirect interactions.
  • verbose (default: TRUE): If `TRUE`, prints a small summary of the graph, including the number of metabolites / proteins and interactions.
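
To make the difference between the `Interacts` and `Between` search modes and the effect of max_neighbours more concrete, here is a small base R sketch on a toy edge list. It is one plausible reading of the descriptions above, not the package's implementation; the node names and the functions edges_interacts(), edges_between() and drop_hubs() are hypothetical.

```r
# Toy undirected edge list; node names are made up for illustration only.
edges <- data.frame(
  from = c("A", "A", "C", "H", "H", "H", "H"),
  to   = c("B", "C", "D", "A", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# "Interacts"-like search: keep every edge that touches at least one seed node.
edges_interacts <- function(edges, seeds) {
  edges[edges$from %in% seeds | edges$to %in% seeds, , drop = FALSE]
}

# "Between"-like search: keep only edges where both endpoints are seed nodes.
edges_between <- function(edges, seeds) {
  edges[edges$from %in% seeds & edges$to %in% seeds, , drop = FALSE]
}

# Hub filter (assumed behaviour): drop every node with more than
# max_neighbours edges, together with all of its edges.
drop_hubs <- function(edges, max_neighbours = Inf) {
  degree <- table(c(edges$from, edges$to))
  hubs <- names(degree)[as.vector(degree) > max_neighbours]
  edges[!(edges$from %in% hubs | edges$to %in% hubs), , drop = FALSE]
}

seeds <- c("A", "B")
interacts <- edges_interacts(edges, seeds)      # edges touching A or B
between <- edges_between(edges, seeds)          # only the A-B edge
no_hubs <- drop_hubs(edges, max_neighbours = 3) # removes the hub H (4 edges)

print(interacts)
print(between)
print(no_hubs)
```

In this toy graph, node H plays the role of a super hub such as Water or ATP: it connects to everything, so removing it leaves only the more specific interactions.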

If the argument simple = FALSE (default), metadata is included in the returned igraph object. This includes the following:

  • id: Identifier of the protein or metabolite.
  • name: Full name of the protein or metabolite.
  • closeness: Harmonic closeness score of the node. Indicates its importance in the graph.
  • go: Gene Ontologies associated with the node.
  • type: Type of the node. Can be either `Metabolite`, `Protein`, `Enzyme` or `Transporter`.
  • color: Color of the node.
  • enzyme: Enzyme code if the node is an enzyme.
  • cofactor: Metabolite ID that functions as a cofactor to the protein.
  • confidence: Confidence (0-1000) of the interaction. This number is only representative for protein-protein interactions. All other interactions have a confidence score of 1000.
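
For readers unfamiliar with harmonic closeness, the base R sketch below computes it on a toy graph as the sum of reciprocal shortest-path distances from a node to all other nodes, with unreachable nodes contributing zero. This is the textbook definition; whether the package normalises the score or weights edges is not stated here, so treat the details as assumptions. The functions bfs_distances() and harmonic_closeness() are hypothetical names.

```r
# Undirected toy graph as an adjacency list; node names are illustrative.
adjacency <- list(
  A = c("B"),
  B = c("A", "C"),
  C = c("B"),
  D = character(0)  # isolated node, unreachable from the others
)

# Breadth-first search: number of steps from `source` to every node
# (Inf when a node cannot be reached).
bfs_distances <- function(adjacency, source) {
  dist <- setNames(rep(Inf, length(adjacency)), names(adjacency))
  dist[source] <- 0
  queue <- source
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    for (nb in adjacency[[current]]) {
      if (is.infinite(dist[nb])) {
        dist[nb] <- dist[current] + 1
        queue <- c(queue, nb)
      }
    }
  }
  dist
}

# Harmonic closeness: sum of 1 / distance to every other node.
# Unreachable nodes contribute 1 / Inf, which is 0 in R.
harmonic_closeness <- function(adjacency) {
  sapply(names(adjacency), function(node) {
    d <- bfs_distances(adjacency, node)
    sum(1 / d[names(d) != node])
  })
}

scores <- harmonic_closeness(adjacency)
print(scores)
```

Unlike classic closeness, this score stays well defined in disconnected graphs, which is why the isolated node D simply receives a score of 0.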

However, when simple = TRUE, only the id, name, and confidence are stored. Several helper functions exist that can be chained to obtain the metadata. These support the dplyr pipe notation. A complete example that mimics the example_graph() function is shown below. Further optional helper functions are listed below the example.

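library(ImmunoMet)
library(dplyr)  # provides the %>% pipe

# Start from a bare-bones graph and add metadata step by step
graph <- get_graph("microglial cell activation", simple = TRUE) %>%
  add_closeness() %>%
  add_gos() %>%
  add_metadata() %>%
  add_node_types() %>%
  add_vertice_colors() %>%
  add_layout()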

Helper functions for chaining calculations and (meta)data

Name                  Description
add_closeness         Calculates the harmonic closeness per node to determine its importance in the graph.
add_gos               Calculates p-values for each Gene Ontology present in the current graph.
add_metadata          Adds vertex metadata about enzymes, pathways, and (super)classes.
add_node_types        Adds a type to each node.
add_vertice_colors    Adds colors to each type of node and edge.
add_layout            Calculates a Fruchterman-Reingold layout so that nodes are placed in a visually pleasing way.
add_communities       Uses the Leiden algorithm to identify communities within the current graph.
remove_unconnected    Removes any node that has no interaction with other nodes.
metabolite_go_graph   Converts the graph to a graph where Gene Ontologies are represented by nodes. Requires add_gos to be run first.